 label model


Learnability with Partial Labels and Adaptive Nearest Neighbors

Errandonea, Nicolas A., Mazuelas, Santiago, Lozano, Jose A., Dasgupta, Sanjoy

arXiv.org Machine Learning

Prior work on partial labels learning (PLL) has shown that learning is possible even when each instance is associated with a bag of labels, rather than a single accurate but costly label. However, the necessary conditions for learning with partial labels remain unclear, and existing PLL methods are effective only in specific scenarios. In this work, we mathematically characterize the settings in which PLL is feasible. In addition, we present PL A-$k$NN, an adaptive nearest-neighbors algorithm for PLL that is effective in general scenarios and enjoys strong performance guarantees. Experimental results corroborate that PL A-$k$NN can outperform state-of-the-art methods in general PLL scenarios.





Mitigating Source Bias for Fairer Weak Supervision

Neural Information Processing Systems

Theoretically, we show that it is possible for our approach to simultaneously improve both accuracy and fairness--in contrast to standard fairness approaches that suffer from tradeoffs. Empirically, we show that our technique improves accuracy on weak supervision baselines by as much as 32% while reducing demographic parity gap by 82.5%.



DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples

Neural Information Processing Systems

However, when the size of labeled data is very small (say a few labeled samples per class), SSL performs poorly and unstably, possibly due to the low quality of learned pseudo labels. In this paper, we propose a new SSL method called DP-SSL that adopts an innovative data programming (DP) scheme to generate probabilistic labels for unlabeled data. Different from existing DP methods that rely on human experts to provide initial labeling functions (LFs), we develop a multiple-choice learning (MCL) based approach to automatically generate LFs from scratch in SSL style. With the noisy labels produced by the LFs, we design a label model to resolve the conflict and overlap among the noisy labels, and finally infer probabilistic labels for unlabeled samples.
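
To make the role of a label model concrete, the following is a minimal sketch of the simplest possible one: a normalized vote over labeling-function outputs. It is not the label model proposed in DP-SSL, whose details are not given in this abstract. The function name `vote_label_model`, the `ABSTAIN` convention, and the toy votes are all hypothetical and chosen only for illustration.

```python
# Hypothetical convention: a labeling function returns ABSTAIN when it has no opinion.
ABSTAIN = -1


def vote_label_model(lf_votes, num_classes):
    """Turn noisy LF votes into per-sample probabilistic labels.

    lf_votes: list of rows, one per sample; each row holds one vote per LF,
              either a class index in [0, num_classes) or ABSTAIN.
    Returns a list of probability vectors, one per sample.
    This is a plain normalized vote, used here as an illustrative stand-in
    for a learned label model.
    """
    probs = []
    for votes in lf_votes:
        counts = [0] * num_classes
        for v in votes:
            if v != ABSTAIN:
                counts[v] += 1
        total = sum(counts)
        if total == 0:
            # Every LF abstained: fall back to a uniform distribution.
            probs.append([1.0 / num_classes] * num_classes)
        else:
            # Conflicting votes become a soft label instead of a hard one.
            probs.append([c / total for c in counts])
    return probs


# Toy example: three samples, three LFs, two classes.
lf_votes = [
    [0, 0, 1],                      # LFs conflict
    [1, ABSTAIN, 1],                # LFs agree, one abstains
    [ABSTAIN, ABSTAIN, ABSTAIN],    # no LF fires
]
probs = vote_label_model(lf_votes, num_classes=2)
```

A learned label model would typically weight each labeling function by an estimate of its accuracy rather than counting all votes equally. The sketch only shows the input and output shapes of the step that resolves conflict and overlap among noisy labels.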